PS312 Statistical Research Methods
- Substantive
  - Causality
  - Mechanisms
  - Confounders
- Technical
  - Statistical Tests
  - Regressions
  - Diagnostics
Welch Two Sample t-test
data: norm_x and norm_y
t = -55.188, df = 17450, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-2.059542 -1.918262
sample estimates:
mean of x mean of y
0.9928849 2.9817871
set.seed(123) # set the seed for reproducibility
norm_x = rnorm(n = 10000, mean = 1, sd = 3) # generate data with different properties
norm_y = rnorm(n = 10000, mean = 3, sd = 2) # generate data with different properties
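The Welch test output shown before the simulation code can be reproduced from these two vectors. The following is a minimal sketch, assuming the slide used base R's `t.test()` with its defaults (the call itself does not appear in the source). The names `welch`, `se`, and `t_manual` are introduced here for illustration only.

```r
# Recreate the simulated data so the block runs on its own
set.seed(123)
norm_x <- rnorm(n = 10000, mean = 1, sd = 3)
norm_y <- rnorm(n = 10000, mean = 3, sd = 2)

# var.equal = FALSE is the default, which gives the Welch (unequal variance) test
welch <- t.test(norm_x, norm_y)
print(welch)

# The same t statistic by hand: difference in means over its standard error
se <- sqrt(var(norm_x) / length(norm_x) + var(norm_y) / length(norm_y))
t_manual <- (mean(norm_x) - mean(norm_y)) / se
print(t_manual)
```

The manual calculation makes explicit that the Welch test does not pool the two variances, which matters here because the two samples were drawn with different standard deviations.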
library(ggplot2) # needed for ggplot(), geom_*(), labs(), theme_bw()

# overlaid histograms with a vertical line at each sample mean
his = ggplot() +
  geom_histogram(aes(x = norm_x, fill = "Distribution X"), alpha = 0.5, bins = 30) +
  geom_histogram(aes(x = norm_y, fill = "Distribution Y"), alpha = 0.5, bins = 30) +
  geom_vline(xintercept = mean(norm_x), color = "red") +
  geom_vline(xintercept = mean(norm_y), color = "blue") +
  labs(x = NULL,
       y = NULL,
       fill = NULL) +
  theme_bw()

# horizontal boxplots of the same two samples
box = ggplot() +
  geom_boxplot(aes(x = norm_x, y = "X", fill = "Distribution X"), alpha = 0.5) +
  geom_boxplot(aes(x = norm_y, y = "Y", fill = "Distribution Y"), alpha = 0.5) +
  labs(x = NULL,
       y = NULL,
       fill = NULL) +
  theme_bw()

| ID | X |
|---|---|
| 1 | 34 |
| 2 | 22 |
| 3 | 19 |
| 4 | 85 |
| ID | Y |
|---|---|
| 1 | Blue |
| 2 | Red |
| 4 | Green |
| 4 | Yellow |
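The two tables share an `ID` column, which is the usual setup for illustrating a merge. The following is a minimal sketch using base R's `merge()`, assuming the tables are meant to be combined on `ID` (the source does not say so explicitly). The data frame names `tab_x` and `tab_y` are hypothetical, and the values are copied as shown, including the repeated ID 4 in the second table.

```r
# The two tables from the slide, entered by hand
tab_x <- data.frame(ID = c(1, 2, 3, 4), X = c(34, 22, 19, 85))
tab_y <- data.frame(ID = c(1, 2, 4, 4), Y = c("Blue", "Red", "Green", "Yellow"))

# Inner join: keep only IDs that appear in both tables
inner <- merge(tab_x, tab_y, by = "ID")
print(inner)

# Left join: keep every row of tab_x; unmatched IDs get NA in Y
left <- merge(tab_x, tab_y, by = "ID", all.x = TRUE)
print(left)
```

With these values, ID 3 has no match in the second table and ID 4 matches two rows, so the left join returns more rows than `tab_x` has. The tidyverse equivalents are `dplyr::inner_join()` and `dplyr::left_join()`.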
| Library | Functions | Description |
|---|---|---|
| tidyverse | filter(), mutate(), ggplot() | data wrangling and visualization |
| modelsummary | modelsummary() | present good looking tables |
| ggeffects | ggpredict() | calculate and visualize marginal effects |
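As a rough illustration of how these three libraries fit together, the sketch below wrangles a dataset, fits a regression, tabulates it, and computes predictions. It assumes the packages are installed and uses the built-in `mtcars` data purely as a stand-in; the dataset, the model formula, and the object names (`cars`, `fit`, `pred`, `pred_plot`) are not from the course material.

```r
library(tidyverse)    # filter(), mutate(), ggplot()
library(modelsummary) # modelsummary()
library(ggeffects)    # ggpredict()

# Wrangle: drop 6-cylinder cars and label the transmission variable
# (in mtcars, am = 0 is automatic and am = 1 is manual)
cars <- mtcars %>%
  filter(cyl != 6) %>%
  mutate(am = factor(am, labels = c("automatic", "manual")))

# Fit a linear regression of fuel economy on weight and transmission
fit <- lm(mpg ~ wt + am, data = cars)

# Present the model as a table (markdown output keeps it readable in the console)
modelsummary(list("OLS" = fit), output = "markdown")

# Predicted mpg across values of wt, holding am at a reference value
pred <- ggpredict(fit, terms = "wt")
pred_plot <- plot(pred)
```

Passing a named list to `modelsummary()` is a convenient habit because further models can be added as extra columns later.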
| Library | Functions | Description |
|---|---|---|
| GGally | ggpairs(), ggcoef() | extension to ggplot |
| ggfortify | autoplot() | extension to ggplot for diagnostics |
| lmtest | bptest() | statistical tests for diagnostics |
| car | vif() | additional statistical tests for diagnostics |
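A short sketch of how these diagnostic tools might be applied to one fitted model follows. It assumes all four packages are installed and again uses the built-in `mtcars` data as a stand-in; the variable selection and the object names (`vars`, `fit`, `bp`, `vif_values`) are illustrative assumptions, not part of the course material.

```r
library(GGally)    # ggpairs()
library(ggfortify) # autoplot() for lm objects
library(lmtest)    # bptest()
library(car)       # vif()

# Keep the outcome and two predictors
vars <- mtcars[, c("mpg", "wt", "hp")]

# Pairwise scatterplots, densities, and correlations before modelling
pairs_plot <- ggpairs(vars)

# Fit the regression to be diagnosed
fit <- lm(mpg ~ wt + hp, data = vars)

# Residuals vs fitted and normal Q-Q plots
diag_plots <- autoplot(fit, which = 1:2)

# Breusch-Pagan test; the null hypothesis is homoskedastic errors
bp <- bptest(fit)
print(bp)

# Variance inflation factors; larger values indicate stronger multicollinearity
vif_values <- vif(fit)
print(vif_values)
```

A small p-value from `bptest()` points to heteroskedasticity, while variance inflation factors close to 1 suggest the predictors are not strongly collinear.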